AITopics

2606.28242

Country: North America > United States (0.14)

Genre: Research Report (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.64)

arXiv.org Machine LearningJun-25-2026

Latent Block-Diffusion Temporal Point Processes: A Semi-Autoregressive Framework for Asynchronous Event Sequence Generation

Zhang, Shuai, Chen, Yancheng, Zhou, Chuan, Liu, Yang, Lin, Xixun, Zhao, Xiangyu, Zhu, Jun, Ma, Zhi-Ming

Modeling and sampling from the underlying distribution of asynchronous event sequences are crucial in various real-world applications, including social networks, medical diagnosis, and financial transactions. Existing autoregressive methods suffer from error accumulation during multi-step generation, while non-autoregressive diffusion methods are typically limited to fixed-length output sequences. In this paper, we propose Latent Block-Diffusion Temporal Point Processes (LBDTPP), a novel semi-autoregressive TPP framework that introduces a latent block diffusion mechanism for high-quality and variable-length event sequence generation. The core idea is to define an autoregressive probability distribution over event blocks in latent space and perform Gaussian diffusion within each block. By sequentially generating blocks while simultaneously sampling events in each block, LBDTPP preserves the length flexibility of autoregressive TPPs and inherits the parallel high-quality generation capability of diffusion models. Theoretically, we derive Wasserstein error bounds showing that, under suitable local approximation and prefix-stability assumptions, block-wise generation can reduce error accumulation compared with event-wise autoregressive generation. Extensive experiments on six real-world benchmark datasets demonstrate that LBDTPP outperforms state-of-the-art TPP baselines in both unconditional and conditional generation tasks. Further empirical analyses verify the benefits of latent-space diffusion and block-wise generation, and reveal the trade-off between generation quality and block size. Our code is available at https://github.com/Zh-Shuai/LBDTPP.

data mining, machine learning, natural language, (19 more...)

2606.24982

Country: Asia > China (0.28)

Genre: Research Report (0.82)

Industry:

Education (0.68)
Health & Medicine (0.48)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Neural Information Processing SystemsJun-22-2026, 23:34:22 GMT

Supplementary Material for Quantifying Generalisation in Imitation Learning

Each split is entirely unique, and no labyrinth appears more than once in4 the entire dataset, regardless of its split. Each entry consists of five pieces of information:5 obs: the path to the image representation for that state;6 actions: the integer action performed for that solution in that state;7 rewards: the float reward received for that action in that state;8 episode_starts: the boolean status that states if it is the first state in an episode; and9 info: the textual information required to load the same labyrinth structure if needed.10 We provide images in each dataset since we believe that visual information is more useful to the11 imitation learning agent, and if a vector representation is needed, the info parameter allows researchers12 to load the same structure and the actions enables the recreation of the dataset in its vector format.13 Each observation is an image of size 600 600 3. Although each baseline trained in this work14 uses a 64 64 3input, we thought that providing a bigger image would benefit models requiring15 downsizing (e.g., the walls will not disappear during resizing).

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Genre: Research Report (0.94)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)

Neural Information Processing SystemsJun-18-2026, 01:52:50 GMT

ProxySPEX: Inference-Efficient Interpretability via Sparse Feature Interactions in LLMs

Large Language Models (LLMs) have achieved remarkable performance by capturing complex interactions between input features. To identify these interactions, most existing approaches require enumerating all possible combinations of features up to a given order, causing them to scale poorly with the number of inputs n. Recently, Kang et al. (2025) proposed SPEX, an information-theoretic approach that uses interaction sparsity to scale to n 103 features. SPEX greatly improves upon prior methods but requires tens of thousands of model inferences, which can be prohibitive for large models. In this paper, we observe that LLM feature interactions are often hierarchical--higher-order interactions are accompanied by their lower-order subsets--which enables more efficient discovery.

justification, large language model, natural language, (13 more...)

Country: North America > United States (0.46)

Industry: Health & Medicine (0.68)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Neural Information Processing SystemsJun-17-2026, 14:32:03 GMT

Short-length Adversarial Training Helps LLMs Defend Long-length Jailbreak Attacks: Theoretical and Empirical Evidence

Jailbreak attacks against large language models (LLMs) aim to induce harmful behaviors in LLMs through carefully crafted adversarial prompts. To mitigate attacks, one way is to perform adversarial training (AT)-based alignment, i.e., training LLMs on some of the most adversarial prompts to help them learn how to behave safely under attacks. During AT, the length of adversarial prompts plays a critical role in the robustness of aligned LLMs. While long-length adversarial prompts during AT might lead to strong LLM robustness, their synthesis however is very resource-consuming, which may limit the application of LLMAT. This paper focuses on adversarial suffix jailbreak attacks and unveils that to defend against a jailbreak attack with an adversarial suffix of length Θ(M), it is enough to align LLMs on prompts with adversarial suffixes of length Θ( M).

large language model, machine learning, natural language, (18 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Education > Educational Setting > Online (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Neural Information Processing SystemsJun-15-2026, 09:12:42 GMT

Predictability Enables of Nonlinear State Space Models

The rise of parallel computing hardware has made it increasingly important to understand which nonlinear state space models can be efficiently parallelized.

artificial intelligence, information processing system, machine learning, (16 more...)

Country:

North America > United States (0.28)
North America > Canada (0.28)
Europe > United Kingdom > England (0.28)
Europe > Germany (0.28)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Machine LearningJun-12-2026

Epistemic Uncertainty Is Not the Reducible Kind

Young, Robin

The standard taxonomy of predictive uncertainty defines epistemic uncertainty as the part removable by collecting more data, while the standard measure identifies it with a mutual-information term. We prove the definition and the measure are extensionally inconsistent. On an explicit construction, the measure assigns all uncertainty to the epistemic class, yet no quantity of training data reduces it. Reducibility is instead a property of the pair (uncertainty, acquisition class), and the dichotomy resolves into three parts: aleatoric, sample-reducible epistemic, and mechanism-reducible epistemic uncertainty. An exact identity for the value of an observation shows that in-distribution data never reduces mechanism-irreducible uncertainty and generically increases it. Ensemble disagreement, the deployed epistemic estimate, tracks the training procedure rather than the epistemic term. It collapses to zero beneath a positive truth under consistent training, and equals hyperparameter-scaled initialization noise under interpolation. A finite-sample falsification test and seed-swept experiments confirm the theory.

artificial intelligence, construction, machine learning, (16 more...)

2606.12646

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

arXiv.org Machine LearningMay-20-2026

Latent Laplace Diffusion for Irregular Multivariate Time Series

You, Zinuo, Zheng, Jin, Cartlidge, John

Irregular multivariate time series impose a trade-off for long-horizon forecasting: discrete methods can distort temporal structure via re-gridding, while continuous-time models often require sequential solvers prone to drift. To bridge this gap, we present Latent Laplace Diffusion (LLapDiff), a generative framework that models the target as a low-dimensional latent trajectory, enabling horizon-wide generation without step-by-step integration over physical time. We guide the reverse process utilizing a stable modal parameterization motivated by stochastic port-Hamiltonian dynamics, and parameterize its mean evolution in the Laplace domain via learnable complex-conjugate poles, enabling direct evaluation over irregular timestamps. We also link continuous dynamics to irregular observations through renewal-averaging analysis, which maps sampling gaps to effective event-domain poles and motivates a gap-aware history summarizer. Extensive experiments show that LLapDiff improves over baselines in long-horizon forecasting, and its continuous-time generative nature supports missing-value imputation by querying the same model at historical timestamps. Code is available at https://github.com/pixelhero98/LLapDiffusion.

artificial intelligence, data mining, machine learning, (14 more...)

2605.19805

Country:

Asia (0.28)
North America > United States (0.14)

Genre: Research Report (0.40)

Industry: Banking & Finance > Trading (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Data Science > Data Mining (0.88)

arXiv.org Machine LearningMay-13-2026

Learning U-Statistics with Active Inference

Wang, Xiaoning, Huo, Yuyang, Peng, Liuhua, Zou, Changliang

$U$-statistics play a central role in statistical inference. In many modern applications, however, acquiring the labels required for $U$-statistics is costly. Motivated by recent advances in active inference, we develop an active inference framework for $U$-statistics that selectively queries informative labels to improve estimation efficiency under a fixed labeling budget, while preserving valid statistical inference. Our approach is built on the augmented inverse probability weighting $U$-statistic, which is designed to incorporate the sampling rule and machine learning predictions. We characterize the optimal sampling rule that minimizes its variance and design practical sampling strategies. We further extend the framework to $U$-statistic-based empirical risk minimization. Experiments on real datasets demonstrate substantial gains in estimation efficiency over baseline methods, while maintaining target coverage.

machine learning, natural language, u-statistics, (14 more...)

2605.11638

Country: Asia (0.68)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Fang, Yu-Hsueh, Lee, Chia-Yen

Optimal Spatio-Temporal Decoupling for Bayesian Conformal Prediction

arXiv.org Machine LearningMay-4-2026

Online Conformal Prediction (CP) struggles to balance temporal adaptability and structural stability. Feedback-driven methods (e.g., Adaptive Conformal Inference (ACI)) suffer from systemic marginal under-coverage and high interval variance during abrupt shifts, while temporally discounted Bayesian CP suffers from severe structural lag and uncalibrated interval bloat. We propose State-Adaptive Bayesian Conformal Prediction (SA-BCP) to achieve optimal spatio-temporal decoupling. By gating long-term temporal inertia with spatial kernel-density evidence, SA-BCP proactively expands intervals for recognized historical regimes while maintaining tight efficiency during stable states. We rigorously prove this mechanism's optimality, identifying a minimax bias-variance tradeoff governed by an evidence threshold $K$. Extensive benchmarks on volatile financial datasets (2016--2026), including AMD, Gold, and GBP/USD, demonstrate that SA-BCP consistently minimizes the strictly proper Winkler score across diverse confidence levels. Specifically, SA-BCP resolves the systematic under-coverage inherent to ACI variants while simultaneously reducing the uncalibrated interval bloat of Bayesian CP by 10\% to 37\% under high-confidence requests. By elegantly navigating this tradeoff, SA-BCP achieves an optimal balance between conditional reliability and predictive efficiency.

artificial intelligence, machine learning, sa-bcp, (19 more...)

2605.00432

Country: Asia > Taiwan (0.14)

Genre: Research Report (0.50)

Industry: Banking & Finance (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)